Decomposition Methods for Machine Learning with Small, Incomplete or Noisy Datasets



Similar resources

Machine Learning with Large Datasets

This paper introduces new algorithms and data structures for quick counting for machine learning datasets. We focus on the counting task of constructing contingency tables, but our approach is also applicable to counting the number of records in a dataset that match conjunctive queries. Subject to certain assumptions, the costs of these operations can be shown to be independent of the number of...


Authoritative Citation KNN Learning with Noisy Training Datasets

In this paper, we investigate the effectiveness of Citation K-Nearest Neighbors (KNN) learning with noisy training datasets. We devise an authority measure associated with each training instance that changes based on the outcome of Citation KNN classification. The authority is increased when a citer’s classification had been right; and vice versa. We show that by modifying only these authority ...


For Incomplete Datasets

In this study, we compare the performance of four different imputation strategies, ranging from the commonly used Listwise Deletion to model-based approaches such as Maximum Likelihood, on enhancing completeness in incomplete software project data sets. We evaluate the impact of each of these methods by implementing them on six different real-time software project data sets which are cl...


Cached Sufficient Statistics for Efficient Machine Learning with Large Datasets

This paper introduces new algorithms and data structures for quick counting for machine learning datasets. We focus on the counting task of constructing contingency tables, but our approach is also applicable to counting the number of records in a dataset that match conjunctive queries. Subject to certain assumptions, the costs of these operations can be shown to be independent of the...


Cached Sufficient Statistics for Efficient Machine Learning with Large Datasets

This paper introduces new algorithms and data structures for quick counting for machine learning datasets. We focus on the counting task of constructing contingency tables, but our approach is also applicable to counting the number of records in a dataset that match conjunctive queries. Subject to certain assumptions, the costs of these operations can be shown to be independent of the number of...



Journal

Journal title: Applied Sciences

Year: 2020

ISSN: 2076-3417

DOI: 10.3390/app10238481